Abstract

We continue our study of the Johnson-Lindenstrauss lemma and its connection to circulant
matrices started in [5]. We reduce the bound on $k$ from $k = O(\varepsilon^{-2}\log^3 n)$ proven there to
$k = O(\varepsilon^{-2}\log^2 n)$. Our technique differs essentially from the one used in [5]. We employ the
discrete Fourier transform and singular value decomposition to deal with the dependency caused
by the circulant structure.

1 Introduction

Let $x^1, \dots, x^n \in \mathbb{R}^d$ be $n$ points in the $d$-dimensional Euclidean space $\mathbb{R}^d$. The classical
Johnson-Lindenstrauss lemma states that, for a given $\varepsilon \in (0, \tfrac12)$ and a natural number
$k = O(\varepsilon^{-2}\log n)$, there exists a linear map $f : \mathbb{R}^d \to \mathbb{R}^k$ such that

\[ (1-\varepsilon)\|x^j\|_2^2 \le \|f(x^j)\|_2^2 \le (1+\varepsilon)\|x^j\|_2^2 \]

for all $j \in \{1, \dots, n\}$.

Here $\|\cdot\|_2$ stands for the Euclidean norm in $\mathbb{R}^d$ or $\mathbb{R}^k$, respectively. Furthermore, here and any time
later, the condition $k = O(\varepsilon^{-2}\log n)$ means that there is an absolute constant $C > 0$ such that
the statement holds for all natural numbers $k$ with $k \ge C\varepsilon^{-2}\log n$. We shall also always assume
that $k \le d$. Otherwise, the statement becomes trivial.

The original proof of this fact was given by Johnson and Lindenstrauss in [7]. We refer to [1] for
a beautiful and self-contained proof. Since then, it has found many applications, for example in
algorithm design. These applications inspired numerous variants and improvements of the
Johnson-Lindenstrauss lemma, which try to minimize the computational cost of $f(x)$, the memory used and
the number of random bits used, and to simplify the algorithm to allow an easy implementation.
We refer to [TJ, [2J O [S] for details and to [S] for a nice description of the history and the actual 
"state of the art". 

All the known proofs of the Johnson-Lindenstrauss lemma work with random matrices and proceed
more or less in the following way. One considers a probability measure $\mathbb{P}$ on some subset $\mathcal{P}$ of all
$k \times d$ matrices (i.e. all linear mappings $\mathbb{R}^d \to \mathbb{R}^k$). The proof of the Johnson-Lindenstrauss lemma
then emerges from some variant of the following two estimates

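\[ \mathbb{P}\bigl(f \in \mathcal{P} : \|f(x)\|_2^2 \ge 1+\varepsilon\bigr) < \frac{1}{2n} \]

and

\[ \mathbb{P}\bigl(f \in \mathcal{P} : \|f(x)\|_2^2 \le 1-\varepsilon\bigr) < \frac{1}{2n}, \]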

which have to be proven for all unit vectors $x \in \mathbb{R}^d$, and a simple union bound over all points
$x^j/\|x^j\|_2$, $j = 1, \dots, n$. Here and later on we assume, without loss of generality, that $x^j \neq 0$ for all
$j = 1, \dots, n$.

The best known construction of $f$ (with respect to the properties mentioned above) was given by
Ailon and Chazelle in [2] with an improvement due to Matoušek, cf. [9]. It states that $f$ may be
given as a composition of a sparse matrix, a certain random Fourier matrix and a random diagonal
matrix. Although it provides a good computational time of $f(x)$ (with high probability $f(x)$
may be computed using $O(d\log d + \min\{d\varepsilon^{-2}\log n, \varepsilon^{-2}\log^3 n\})$ operations), it still requires that
each coordinate of the $k \times d$ matrix is generated independently. In [5], we studied a different
construction of $f$, namely the composition of a random circulant matrix with a
random diagonal matrix. As multiplication by a circulant matrix may be implemented with the help of
a discrete Fourier transform, it provides the running time of $O(d\log d)$, requires less randomness
(only $2d$ compared to $kd$ or $(k+1)d$ used earlier) and allows a very simple implementation, as the
Fast Fourier Transform is a part of every standard mathematical software package.

The main difference between this approach and all the other constructions available in the literature
so far is that the components of $f(x)$ are now no longer independent random variables. Decoupling
this dependence, we were able to prove in [5] the Johnson-Lindenstrauss lemma for the composition of
a random circulant matrix and a random diagonal matrix, but only for $k = O(\varepsilon^{-2}\log^3 n)$. It is the
main aim of this note to improve this bound to $k = O(\varepsilon^{-2}\log^2 n)$. This comes essentially closer
to the standard bound $k = O(\varepsilon^{-2}\log n)$. Reaching this optimal bound (and keeping control of
the constants involved) remains an open problem and a subject of challenging research.

We use a completely different technique here, based on the discrete Fourier transform and the singular
value decomposition of circulant matrices. That is the reason why we found it more instructive
to state and prove our variant of the Johnson-Lindenstrauss lemma for complex vectors and Gaussian
random variables. As a corollary, we of course obtain a corresponding real version.

To state our main result, we first fix some notation. Let 

• $\varepsilon \in (0, \tfrac12)$,

• $n \ge d$ be natural numbers,

• $x^1, \dots, x^n \in \mathbb{C}^d$ be $n$ arbitrary points in $\mathbb{C}^d$,

• $k = O(\varepsilon^{-2}\log^2 n)$ be a natural number smaller than $d$,

• $a = (a_0, \dots, a_{d-1})$ be independent complex Gaussian variables, cf. Definition 2.1,

• $\varkappa = (\varkappa_0, \dots, \varkappa_{d-1})$ be independent Bernoulli variables.

We denote by $M_{a,k}$ and $D_\varkappa$ the partial random circulant matrix and the random diagonal matrix,
respectively, cf. Definition 2.2 for details.

Theorem 1.1. The mapping $f : \mathbb{C}^d \to \mathbb{C}^k$ given by $f(x) = \frac{1}{\sqrt{2k}} M_{a,k} D_\varkappa x$ satisfies

\[ (1-\varepsilon)\|x^j\|_2^2 \le \|f(x^j)\|_2^2 \le (1+\varepsilon)\|x^j\|_2^2 \]

for all $j \in \{1, \dots, n\}$ with probability at least 2/3. Here $\|\cdot\|_2$ stands for the $\ell_2$-norm in $\mathbb{C}^d$ or $\mathbb{C}^k$,
respectively.
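To make the construction concrete, the following minimal Python sketch applies the map of Theorem 1.1 to a single vector. It is an illustration under stated assumptions rather than the author's implementation: the function name `jl_circulant`, the seed and the sizes $d = 128$, $k = 64$ are hypothetical choices, the entry convention $(M_{a,k})_{j,u} = a_{(u-j) \bmod d}$ is read off from Definition 2.2, and the product is formed naively in $O(kd)$ operations instead of through a fast Fourier transform.

```python
import math
import random


def jl_circulant(x, a, signs, k):
    """Return f(x) = (2k)^(-1/2) * M_{a,k} D_signs x for a length-d vector x.

    Assumed entry convention: (M_{a,k})[j][u] = a[(u - j) mod d], so row 0 is a.
    """
    d = len(x)
    z = [s * v for s, v in zip(signs, x)]  # D_signs x
    scale = 1.0 / math.sqrt(2 * k)
    return [scale * sum(a[(u - j) % d] * z[u] for u in range(d)) for j in range(k)]


rng = random.Random(1)  # fixed seed, for reproducibility only
d, k = 128, 64          # illustrative sizes, not taken from the text

# Complex Gaussian variables: real and imaginary parts are independent N(0, 1).
a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
signs = [rng.choice((-1, 1)) for _ in range(d)]  # Bernoulli signs
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]

fx = jl_circulant(x, a, signs, k)
ratio = sum(abs(v) ** 2 for v in fx) / sum(abs(v) ** 2 for v in x)
print(f"||f(x)||^2 / ||x||^2 = {ratio:.3f}")
```

For a single draw the printed ratio is only expected to be close to 1; how close, and simultaneously for $n$ points, is what Theorem 1.1 quantifies.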

For the reader's convenience, we also formulate a variant of Theorem 1.1 which deals with real
Euclidean spaces.

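Corollary 1.2. Let $\varepsilon \in (0, \tfrac12)$, $n \ge d$ be natural numbers, and let $x^1, \dots, x^n \in \mathbb{R}^{2d}$ be $n$ arbitrary
points in $\mathbb{R}^{2d}$. Let $\alpha_0, \dots, \alpha_{d-1}, \beta_0, \dots, \beta_{d-1}$ be $2d$ independent real Gaussian variables and let
$\varkappa = (\varkappa_0, \dots, \varkappa_{d-1})$ be independent Bernoulli variables.

If $k = O(\varepsilon^{-2}\log^2 n)$ is a natural number, then the mapping $f : \mathbb{R}^{2d} \to \mathbb{R}^{2k}$ given by

\[ f(x) = \frac{1}{\sqrt{2k}}
\begin{pmatrix} M_{\alpha,k} & -M_{\beta,k} \\ M_{\beta,k} & M_{\alpha,k} \end{pmatrix}
\begin{pmatrix} D_\varkappa & 0 \\ 0 & D_\varkappa \end{pmatrix} x \]

satisfies

\[ (1-\varepsilon)\|x^j\|_2^2 \le \|f(x^j)\|_2^2 \le (1+\varepsilon)\|x^j\|_2^2 \]

for all $j \in \{1, \dots, n\}$ with probability at least 2/3. Here $\|\cdot\|_2$ stands for the $\ell_2$-norm in $\mathbb{R}^{2d}$ or
$\mathbb{R}^{2k}$, respectively.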

The proof follows trivially from Theorem 1.1 by considering complex Gaussian variables
$a = (\alpha_0 + i\beta_0, \dots, \alpha_{d-1} + i\beta_{d-1})$ and complex vectors
$y^j = (x^j_0 + i x^j_d, \dots, x^j_{d-1} + i x^j_{2d-1}) \in \mathbb{C}^d$, $j = 1, \dots, n$.
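The reduction from the real to the complex statement can be checked numerically. The sketch below is an illustration with hypothetical helper names (`partial_circulant`, `matvec`) and small arbitrary sizes: it applies the real block matrix of Corollary 1.2 to a point of $\mathbb{R}^{2d}$ and compares the result with the real and imaginary parts of the complex map applied to $y = (x_0 + i x_d, \dots, x_{d-1} + i x_{2d-1})$.

```python
import math
import random


def partial_circulant(a, k):
    """Rows 0..k-1 of the circulant matrix with first row a (entry a[(u - j) mod d])."""
    d = len(a)
    return [[a[(u - j) % d] for u in range(d)] for j in range(k)]


def matvec(m, v):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(p * q for p, q in zip(row, v)) for row in m]


rng = random.Random(2)
d, k = 6, 3  # small illustrative sizes
alpha = [rng.gauss(0, 1) for _ in range(d)]
beta = [rng.gauss(0, 1) for _ in range(d)]
signs = [rng.choice((-1, 1)) for _ in range(d)]
x = [rng.gauss(0, 1) for _ in range(2 * d)]  # a real point in R^{2d}
scale = 1.0 / math.sqrt(2 * k)

# Real block form: apply diag(D, D), then [[M_alpha, -M_beta], [M_beta, M_alpha]].
m_alpha, m_beta = partial_circulant(alpha, k), partial_circulant(beta, k)
head = [s * v for s, v in zip(signs, x[:d])]
tail = [s * v for s, v in zip(signs, x[d:])]
top = [p - q for p, q in zip(matvec(m_alpha, head), matvec(m_beta, tail))]
bottom = [p + q for p, q in zip(matvec(m_beta, head), matvec(m_alpha, tail))]
real_image = [scale * v for v in top + bottom]

# Complex form: a = alpha + i*beta acting on y = x[:d] + i*x[d:].
a = [complex(p, q) for p, q in zip(alpha, beta)]
y = [complex(p, q) for p, q in zip(x[:d], x[d:])]
z = [s * v for s, v in zip(signs, y)]
complex_image = [scale * v for v in matvec(partial_circulant(a, k), z)]

gap = max(
    max(abs(real_image[j] - complex_image[j].real),
        abs(real_image[k + j] - complex_image[j].imag))
    for j in range(k)
)
print(f"largest discrepancy between the two forms: {gap:.2e}")
```

Since the first $k$ real coordinates are the real parts and the last $k$ the imaginary parts of the complex image, both maps have the same Euclidean norm, which is all the corollary needs.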

2 Used techniques 

Let us give a brief overview of the techniques used in the proof of Theorem 1.1. We shall list only those
few properties needed in the sequel.

2.1 Discrete Fourier transform 

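Our main tool in this note is the discrete Fourier transform. If $d$ is a natural number, then the
discrete Fourier transform $\mathcal{F}_d : \mathbb{C}^d \to \mathbb{C}^d$ is defined by

\[ (\mathcal{F}_d x)(\xi) = \frac{1}{\sqrt d} \sum_{u=0}^{d-1} x_u \exp\Bigl(-\frac{2\pi i u \xi}{d}\Bigr). \]

With this normalisation, $\mathcal{F}_d$ is an isomorphism of $\mathbb{C}^d$ onto itself. The inverse discrete Fourier
transform is given by

\[ (\mathcal{F}_d^{-1} x)(\xi) = \frac{1}{\sqrt d} \sum_{u=0}^{d-1} x_u \exp\Bigl(\frac{2\pi i u \xi}{d}\Bigr). \]

Observe that the matrix representation of $\mathcal{F}_d^{-1}$ is the conjugate transpose of the matrix
representation of $\mathcal{F}_d$, i.e. $\mathcal{F}_d^{-1} = \mathcal{F}_d^*$.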
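The normalisation by $1/\sqrt d$ is what makes the transform unitary. The following sketch, with a hypothetical helper `dft` that evaluates the defining sums directly (so in $O(d^2)$ operations, not as a fast transform), checks on one random vector that the transform with the opposite sign in the exponent inverts it and that the Euclidean norm is preserved.

```python
import cmath
import math
import random


def dft(x, sign=-1):
    """Unitary DFT: d^(-1/2) * sum_u x[u] * exp(sign * 2*pi*i*u*xi/d).

    sign=-1 gives the forward transform, sign=+1 its inverse.
    """
    d = len(x)
    return [
        sum(x[u] * cmath.exp(sign * 2j * math.pi * u * xi / d) for u in range(d))
        / math.sqrt(d)
        for xi in range(d)
    ]


def norm2(x):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))


rng = random.Random(0)
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(8)]
y = dft(x)

roundtrip_error = max(abs(p - q) for p, q in zip(dft(y, sign=+1), x))
norm_gap = abs(norm2(y) - norm2(x))
print(f"round trip error {roundtrip_error:.2e}, norm gap {norm_gap:.2e}")
```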
2.2 Circulant matrices

Definition 2.1. Let $\alpha$ and $\beta$ be independent real Gaussian random variables with

\[ \mathbb{E}\alpha = \mathbb{E}\beta = 0 \quad\text{and}\quad \mathbb{E}|\alpha|^2 = \mathbb{E}|\beta|^2 = 1. \]

Then we call

\[ a = \alpha + i\beta \]

a complex Gaussian variable.

Let us note that if $a$ is a complex Gaussian variable, then

\[ \mathbb{E}a = \mathbb{E}\alpha + i\,\mathbb{E}\beta = 0 \quad\text{and}\quad \mathbb{E}|a|^2 = \mathbb{E}\alpha^2 + \mathbb{E}\beta^2 = 2. \]

Definition 2.2. (i) Let $k \le d$ be natural numbers. Let $a = (a_0, \dots, a_{d-1}) \in \mathbb{C}^d$ be a fixed complex
vector. We denote by $M_{a,k}$ the partial circulant matrix

\[ M_{a,k} =
\begin{pmatrix}
a_0 & a_1 & a_2 & \dots & a_{d-1} \\
a_{d-1} & a_0 & a_1 & \dots & a_{d-2} \\
a_{d-2} & a_{d-1} & a_0 & \dots & a_{d-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{d-k+1} & a_{d-k+2} & a_{d-k+3} & \dots & a_{d-k}
\end{pmatrix} \in \mathbb{C}^{k \times d}. \]

If $k = d$, we denote by $M_a = M_{a,d}$ the full circulant matrix. This notation extends naturally to the
case when $a = (a_0, \dots, a_{d-1})$ are independent complex Gaussian variables.

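(ii) If $\varkappa = (\varkappa_0, \dots, \varkappa_{d-1})$ are independent Bernoulli variables, we put

\[ D_\varkappa = \mathrm{diag}(\varkappa) :=
\begin{pmatrix}
\varkappa_0 & 0 & \dots & 0 \\
0 & \varkappa_1 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \varkappa_{d-1}
\end{pmatrix} \in \mathbb{R}^{d \times d}. \]

Of course, $D_\varkappa : \mathbb{C}^d \to \mathbb{C}^d$ is also an isomorphism.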

The fundamental connection between the discrete Fourier transform and circulant matrices is given by

\[ M_a = \mathcal{F}_d\, \mathrm{diag}(\sqrt d\, \mathcal{F}_d a)\, \mathcal{F}_d^{-1}, \tag{2.1} \]

which may be verified by direct calculation. Hence every circulant matrix may be diagonalised with
the use of a discrete Fourier transform, its inverse and a multiple of the discrete Fourier transform
of its first row.
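The direct calculation behind (2.1) can be mirrored numerically. The sketch below is a hedged illustration: the helpers `dft`, `circulant_apply` and `circulant_apply_fourier` are hypothetical names, the entry convention $(M_a)_{j,u} = a_{(u-j) \bmod d}$ is the one suggested by Definition 2.2, and the transforms are evaluated by their defining sums. Replacing `dft` by a fast Fourier transform routine would give the $O(d\log d)$ running time; that substitution is not shown here.

```python
import cmath
import math
import random


def dft(x, sign=-1):
    """Unitary DFT (sign=-1) or its inverse (sign=+1), evaluated by direct summation."""
    d = len(x)
    return [
        sum(x[u] * cmath.exp(sign * 2j * math.pi * u * xi / d) for u in range(d))
        / math.sqrt(d)
        for xi in range(d)
    ]


def circulant_apply(a, x):
    """Direct product M_a x with (M_a)[j][u] = a[(u - j) mod d]."""
    d = len(a)
    return [sum(a[(u - j) % d] * x[u] for u in range(d)) for j in range(d)]


def circulant_apply_fourier(a, x):
    """M_a x computed as F_d diag(sqrt(d) * F_d a) F_d^{-1} x, following (2.1)."""
    d = len(a)
    symbol = [math.sqrt(d) * v for v in dft(a)]    # diagonal entries
    coefficients = dft(x, sign=+1)                 # F_d^{-1} x
    return dft([s * c for s, c in zip(symbol, coefficients)])


rng = random.Random(5)
d = 12
a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]

gap = max(abs(p - q) for p, q in zip(circulant_apply(a, x), circulant_apply_fourier(a, x)))
print(f"largest discrepancy between the two products: {gap:.2e}")
```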



2.3 Singular value decomposition 

The last tool needed in the proof is the singular value decomposition. Let $M : \mathbb{C}^d \to \mathbb{C}^k$ be a $k \times d$
complex matrix with $k \le d$. Then there exists a decomposition

\[ M = U \Sigma V^*, \]

where $U$ is a $k \times k$ unitary complex matrix, $\Sigma$ is a $k \times k$ diagonal matrix with nonnegative entries
on the diagonal, $V$ is a $d \times k$ complex matrix with $k$ orthonormal columns and $V^*$ denotes the
conjugate transpose of $V$. Hence $V^*$ has $k$ orthonormal rows. The entries of $\Sigma$ are the singular
values of $M$, namely the square roots of the eigenvalues of $MM^*$.

If $a = (a_0, \dots, a_{d-1}) \in \mathbb{C}^d$ is a complex vector and $M_a$ is the corresponding circulant matrix, then
its singular values may be calculated using (2.1). We obtain

\begin{align*}
M_a M_a^* &= \mathcal{F}_d\, \mathrm{diag}(\sqrt d\, \mathcal{F}_d a)\, \mathcal{F}_d^{-1} \bigl[\mathcal{F}_d\, \mathrm{diag}(\sqrt d\, \mathcal{F}_d a)\, \mathcal{F}_d^{-1}\bigr]^*
 = \mathcal{F}_d\, \mathrm{diag}(\sqrt d\, \mathcal{F}_d a)\, \mathrm{diag}(\overline{\sqrt d\, \mathcal{F}_d a})\, \mathcal{F}_d^{-1} \\
&= \mathcal{F}_d\, \mathrm{diag}(d\,|\mathcal{F}_d a|^2)\, \mathcal{F}_d^{-1}.
\end{align*}

Hence, the singular values of $M_a$ are $\{\sqrt d\, |(\mathcal{F}_d a)(\xi)|\}_{\xi=0}^{d-1}$.
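This description of the singular values can be tested without any linear algebra library. In the sketch below (hypothetical helper names, direct summation, an arbitrary small $d$), each Fourier vector $v_m(\xi) = d^{-1/2}\exp(-2\pi i m \xi/d)$ is a unit eigenvector of $M_a$, so $\|M_a v_m\|_2$ should equal $\sqrt d\,|(\mathcal{F}_d a)(m)|$; in addition the squared singular values should sum to the squared Frobenius norm $d\,\|a\|_2^2$.

```python
import cmath
import math
import random


def dft(x):
    """Unitary DFT with the factor d^(-1/2), evaluated by direct summation."""
    d = len(x)
    return [
        sum(x[u] * cmath.exp(-2j * math.pi * u * xi / d) for u in range(d)) / math.sqrt(d)
        for xi in range(d)
    ]


def circulant_apply(a, x):
    """Direct product M_a x with (M_a)[j][u] = a[(u - j) mod d]."""
    d = len(a)
    return [sum(a[(u - j) % d] * x[u] for u in range(d)) for j in range(d)]


rng = random.Random(6)
d = 10
a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]

# Candidate singular values sqrt(d) * |(F_d a)(xi)|.
singular = [math.sqrt(d) * abs(v) for v in dft(a)]

# Sum of squared singular values against ||M_a||_F^2 = d * ||a||_2^2.
frobenius_gap = abs(sum(s * s for s in singular) - d * sum(abs(v) ** 2 for v in a))

# ||M_a v_m||_2 against the m-th candidate, for every Fourier vector v_m.
worst = 0.0
for m in range(d):
    v_m = [cmath.exp(-2j * math.pi * m * xi / d) / math.sqrt(d) for xi in range(d)]
    image_norm = math.sqrt(sum(abs(w) ** 2 for w in circulant_apply(a, v_m)))
    worst = max(worst, abs(image_norm - singular[m]))

print(f"Frobenius gap {frobenius_gap:.2e}, worst eigenvector gap {worst:.2e}")
```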

The action of an arbitrary projection on a vector of independent real Gaussian variables is very
well known. It may be described as follows.

Lemma 2.3. Let $a = (a_0, \dots, a_{d-1})$ be independent real Gaussian variables. Let $k \le d$ be a natural
number and let $x^1, \dots, x^k$ be mutually orthogonal unit vectors in $\mathbb{R}^d$. Then

\[ \{\langle a, x^j \rangle\}_{j=1}^{k} \]

is equidistributed with a $k$-dimensional vector of independent real Gaussian variables.

A direct calculation shows that Lemma 2.3 holds also for complex vectors $a$ and $x^1, \dots, x^k$. We
present the following formulation of this fact.

Lemma 2.4. Let $a = (a_0, \dots, a_{d-1})$ be independent complex Gaussian variables. Let $W$ be a
$k \times d$ matrix with $k$ orthonormal rows. Then $Wa$ is equidistributed with a $k$-dimensional vector of
independent complex Gaussian variables.



3 Proof of Theorem 1.1

We shall need the following statement, which describes the preconditioning role of the diagonal
matrix $D_\varkappa$. A similar fact has also been used in [2]. Nevertheless, using the discrete Fourier transform
instead of a Hadamard matrix does not pose any restrictions on the underlying dimension $d$.
Without repeating the details, we point out that we discussed briefly in [5, Remark 2.5] why this
preconditioning may not be omitted.

Lemma 3.1. Let $n \ge d$ be natural numbers and let $x^1, \dots, x^n \in \mathbb{C}^d$ be complex vectors. Let
$\varkappa = (\varkappa_0, \dots, \varkappa_{d-1})$ be independent Bernoulli variables. Then there is an absolute constant $C > 0$
such that with probability at least 5/6

\[ \|\mathcal{F}_d D_\varkappa(x^j)\|_\infty \le C \sqrt{\frac{\log n}{d}}\, \|x^j\|_2 \tag{3.1} \]

holds for all $j \in \{1, \dots, n\}$.



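Proof. Let $x = \alpha + i\beta$ be a unit complex vector in $\mathbb{C}^d$. We put $y = (y_0, \dots, y_{d-1}) = \mathcal{F}_d D_\varkappa(x)$.
Then we may estimate

\[ \mathbb{P}_\varkappa(|y_l| > s) \le 2\,\mathbb{P}_\varkappa\Bigl(\Re y_l > \frac{s}{\sqrt2}\Bigr) + 2\,\mathbb{P}_\varkappa\Bigl(\Im y_l > \frac{s}{\sqrt2}\Bigr), \quad l = 0, \dots, d-1, \tag{3.2} \]

where

\[ \Re y_l = \frac{1}{\sqrt d} \sum_{u=0}^{d-1} \varkappa_u \bigl[\alpha_u \cos(2\pi l u/d) + \beta_u \sin(2\pi l u/d)\bigr] \]

and

\[ \Im y_l = \frac{1}{\sqrt d} \sum_{u=0}^{d-1} \varkappa_u \bigl[\beta_u \cos(2\pi l u/d) - \alpha_u \sin(2\pi l u/d)\bigr] \]

are the real and the imaginary part of $y_l$, respectively.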

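Using Markov's inequality and a real parameter $t > 0$, which is at our disposal, we may proceed
in a standard way:

\begin{align*}
\mathbb{P}_\varkappa\Bigl(\Re y_l > \frac{s}{\sqrt2}\Bigr)
&= \mathbb{P}_\varkappa\Bigl(\exp\Bigl(t\,\Re y_l - \frac{ts}{\sqrt2}\Bigr) > 1\Bigr)
 \le \exp\Bigl(-\frac{ts}{\sqrt2}\Bigr)\, \mathbb{E}_\varkappa \exp(t\,\Re y_l) \\
&= \exp\Bigl(-\frac{ts}{\sqrt2}\Bigr) \prod_{u=0}^{d-1} \cosh\Bigl(\frac{t}{\sqrt d}\bigl[\alpha_u \cos(2\pi l u/d) + \beta_u \sin(2\pi l u/d)\bigr]\Bigr) \\
&\le \exp\Bigl(-\frac{ts}{\sqrt2}\Bigr) \prod_{u=0}^{d-1} \exp\Bigl(\frac{t^2}{2d}\bigl[\alpha_u \cos(2\pi l u/d) + \beta_u \sin(2\pi l u/d)\bigr]^2\Bigr) \\
&\le \exp\Bigl(-\frac{st}{\sqrt2}\Bigr) \prod_{u=0}^{d-1} \exp\Bigl(\frac{t^2}{2d}\bigl[\alpha_u^2 + \beta_u^2\bigr]\Bigr)
 = \exp\Bigl(-\frac{st}{\sqrt2} + \frac{t^2}{2d}\Bigr).
\end{align*}

We have used the inequality $\cosh(v) \le \exp(v^2/2)$, which holds for all $v \in \mathbb{R}$, the inequality
between geometric and quadratic means, and $\sum_{u}(\alpha_u^2 + \beta_u^2) = \|x\|_2^2 = 1$. For the optimal
$t = sd/\sqrt2$ this is equal to $\exp(-s^2 d/4)$.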

As the second summand in (3.2) may be estimated in the same way, we obtain

\[ \mathbb{P}_\varkappa(|y_l| > s) \le 4 \exp\Bigl(-\frac{s^2 d}{4}\Bigr), \quad l = 0, \dots, d-1. \tag{3.3} \]

Choosing $s = O(d^{-1/2}\sqrt{\log n})$ and applying the union bound over all $nd \le n^2$ components of
$\{\mathcal{F}_d D_\varkappa(x^j/\|x^j\|_2)\}_{j=1}^{n}$, we obtain the result. □
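The flattening effect described by Lemma 3.1 is easy to observe on the least favourable kind of input. The sketch below is an illustration with arbitrary choices (seed, $d = 64$, the helper name `dft`): the constant unit vector has all of its Fourier mass in a single coordinate, so its transform has sup-norm 1, whereas after multiplication by random signs the sup-norm drops to the order of $\sqrt{\log d / d}$.

```python
import cmath
import math
import random


def dft(x):
    """Unitary DFT with the factor d^(-1/2), evaluated by direct summation."""
    d = len(x)
    return [
        sum(x[u] * cmath.exp(-2j * math.pi * u * xi / d) for u in range(d)) / math.sqrt(d)
        for xi in range(d)
    ]


rng = random.Random(3)
d = 64
x = [1.0 / math.sqrt(d)] * d                      # unit vector with a peaked transform
signs = [rng.choice((-1, 1)) for _ in range(d)]   # Bernoulli signs

before = max(abs(v) for v in dft(x))
after = max(abs(v) for v in dft([s * v for s, v in zip(signs, x)]))
reference = math.sqrt(math.log(d) / d)            # scale appearing in Lemma 3.1, constant omitted

print(f"sup-norm before signs: {before:.3f}")
print(f"sup-norm after signs:  {after:.3f} (reference scale {reference:.3f})")
```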



Proof of Theorem 1.1.

Let us choose a vector $\varkappa = (\varkappa_0, \dots, \varkappa_{d-1}) \in \{-1, +1\}^d$ such that (3.1) holds. According to
Lemma 3.1 this happens with probability at least 5/6.

Let us take $x = x^j/\|x^j\|_2$ for any fixed $j = 1, \dots, n$. We show that there is an absolute constant $c > 0$
such that

\[ \mathbb{P}_a\bigl(\|M_{a,k} D_\varkappa x\|_2^2 \ge 2(1+\varepsilon)k\bigr) \le \exp\Bigl(-\frac{c k \varepsilon^2}{\log n}\Bigr) \tag{3.4} \]

and

\[ \mathbb{P}_a\bigl(\|M_{a,k} D_\varkappa x\|_2^2 \le 2(1-\varepsilon)k\bigr) \le \exp\Bigl(-\frac{c k \varepsilon^2}{\log n}\Bigr) \tag{3.5} \]

hold. From (3.4) and (3.5), Theorem 1.1 follows again by a union bound over all $j = 1, \dots, n$.
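The bookkeeping behind this union bound is short enough to spell out. The following is a sketch, not taken from the source: $C$ denotes the absolute constant hidden in $k = O(\varepsilon^{-2}\log^2 n)$, $\tilde x^j = x^j/\|x^j\|_2$, natural logarithms and $n \ge 2$ are assumed, and the threshold $cC \ge 1 + \log_2 12$ is one convenient, non-optimised choice.

```latex
\begin{align*}
\mathbb{P}_a\Bigl(\exists\, j :\ \Bigl|\tfrac{1}{2k}\|M_{a,k} D_\varkappa \tilde x^j\|_2^2 - 1\Bigr| \ge \varepsilon\Bigr)
  &\le 2n \exp\Bigl(-\frac{c k \varepsilon^2}{\log n}\Bigr)
   \le 2n \exp(-cC \log n) = 2 n^{1 - cC} \le \frac16, \\
\mathbb{P}(\text{the conclusion of Theorem 1.1 fails})
  &\le \underbrace{\tfrac16}_{\text{Lemma 3.1}} + \underbrace{\tfrac16}_{\text{(3.4), (3.5)}} = \tfrac13 .
\end{align*}
```

The first line uses $k \ge C\varepsilon^{-2}\log^2 n$, and the two contributions of $1/6$ together leave the success probability $2/3$ stated in the theorem.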
Let $y^j = S^j(D_\varkappa x) \in \mathbb{C}^d$, $j = 0, \dots, k-1$, where $S$ is the shift operator defined by

\[ S : \mathbb{C}^d \to \mathbb{C}^d, \quad S(z_0, \dots, z_{d-1}) = (z_1, \dots, z_{d-1}, z_0). \]

We denote by $Y$ the $k \times d$ matrix with rows $y^0, \dots, y^{k-1}$.
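Then it holds

\[ \|M_{a,k} D_\varkappa x\|_2^2
 = \sum_{j=0}^{k-1} \Bigl|\sum_{u=0}^{d-1} a_{(u-j) \bmod d}\, \varkappa_u x_u\Bigr|^2
 = \sum_{j=0}^{k-1} \Bigl|\sum_{u=0}^{d-1} a_u\, y^j_u\Bigr|^2
 = \|Ya\|_2^2. \]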



Let $Y = U \Sigma V^*$ be the singular value decomposition of $Y$. As mentioned above, $b := V^* a$ is a
$k$-dimensional vector of independent complex Gaussian variables. Hence,

\[ \|Ya\|_2^2 = \|U \Sigma V^* a\|_2^2 = \|U \Sigma b\|_2^2 = \|\Sigma b\|_2^2 = \sum_{j=0}^{k-1} \lambda_j^2 |b_j|^2, \]

where $\lambda_j$, $j = 0, \dots, k-1$, are the singular values of $Y$. Let us denote $\mu_j = \lambda_j^2$. Then

\[ \|\mu\|_1 = \sum_{j=0}^{k-1} \lambda_j^2 = \|Y\|_F^2 = k, \]

where $\|Y\|_F$ is the Frobenius norm of $Y$; it equals $\sqrt k$ because each of the $k$ rows of $Y$ is a shift of the unit vector $D_\varkappa x$.
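Both identities used so far, $M_{a,k} D_\varkappa x = Ya$ and $\|Y\|_F^2 = k$, can be confirmed on a small random instance. The sketch below uses hypothetical names (`shift`, `lhs`, `rhs`), arbitrary sizes and the entry convention $(M_{a,k})_{j,u} = a_{(u-j) \bmod d}$; it is a numerical sanity check, not part of the proof.

```python
import math
import random


def shift(z, j):
    """S^j z, where S(z_0, ..., z_{d-1}) = (z_1, ..., z_{d-1}, z_0)."""
    d = len(z)
    return [z[(u + j) % d] for u in range(d)]


rng = random.Random(4)
d, k = 16, 5  # small illustrative sizes
a = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
signs = [rng.choice((-1, 1)) for _ in range(d)]
raw = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
length = math.sqrt(sum(abs(v) ** 2 for v in raw))
z = [s * v / length for s, v in zip(signs, raw)]  # D x for a unit vector x

# Left-hand side: the k entries of M_{a,k} D x.
lhs = [sum(a[(u - j) % d] * z[u] for u in range(d)) for j in range(k)]

# Right-hand side: Y a, where row j of Y is the shifted vector S^j(D x).
Y = [shift(z, j) for j in range(k)]
rhs = [sum(Y[j][u] * a[u] for u in range(d)) for j in range(k)]

gap = max(abs(p - q) for p, q in zip(lhs, rhs))
frobenius_sq = sum(abs(v) ** 2 for row in Y for v in row)
print(f"max |M D x - Y a| = {gap:.2e}, ||Y||_F^2 = {frobenius_sq:.6f}")
```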
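Moreover,

\begin{align*}
\|\mu\|_\infty = \|\lambda\|_\infty^2 &= \sup_{z \in \mathbb{C}^d,\ \|z\|_2 \le 1} \|Yz\|_2^2 \tag{3.6} \\
&\le \sup_{z \in \mathbb{C}^d,\ \|z\|_2 \le 1} \|M_{D_\varkappa x}\, z\|_2^2 = d\, \|\mathcal{F}_d D_\varkappa(x)\|_\infty^2 \le C^2 \log n,
\end{align*}

where $M_{D_\varkappa x}$ stands for the $d \times d$ complex circulant matrix with the first row equal to $D_\varkappa x$.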

This leads finally also to

\[ \|\mu\|_2 \le \sqrt{\|\mu\|_1 \cdot \|\mu\|_\infty} \le C \sqrt{k \log n}. \tag{3.7} \]

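Then

\[ \mathbb{P}_a\bigl(\|Ya\|_2^2 \ge 2(1+\varepsilon)k\bigr) = \mathbb{P}_b\Bigl(\sum_{j=0}^{k-1} \mu_j (|b_j|^2 - 2) \ge 2\varepsilon k\Bigr). \]

We denote

\[ Z := \sum_{j=0}^{k-1} \mu_j (|b_j|^2 - 2). \]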

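The complex version of Lemma 1 from Section 4.1 of [8] (cf. also Lemma 2.2 of [9]) states that

\[ \mathbb{P}_b\bigl(Z \ge 2\sqrt2\, \|\mu\|_2 \sqrt t + 2 \|\mu\|_\infty t\bigr) \le \exp(-t). \tag{3.8} \]

Using (3.6) and (3.7), we arrive at

\[ \mathbb{P}_b\bigl(Z \ge 2\sqrt2\, C \sqrt{t k \log n} + 2 C^2 t \log n\bigr) \le \exp(-t). \]

Choosing $t = \frac{c' k \varepsilon^2}{\log n}$ for $c' > 0$ small enough, we get

\[ \mathbb{P}_b(Z \ge 2\varepsilon k) \le \exp\Bigl(-\frac{c' k \varepsilon^2}{\log n}\Bigr). \]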
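How small $c'$ has to be can be read off by substituting the choice of $t$ into the threshold of the preceding bound. The computation below is a sketch; the explicit value $c' \le 1/(4C^2)$ is an illustrative assumption, not a constant stated in the source, and the estimate uses only $\varepsilon < 1$.

```latex
\begin{align*}
t = \frac{c' k \varepsilon^2}{\log n}
\ \Longrightarrow\
2\sqrt2\, C \sqrt{t k \log n} + 2 C^2 t \log n
  &= 2\sqrt{2c'}\, C\, k \varepsilon + 2 C^2 c'\, k \varepsilon^2 \\
  &\le \bigl(\sqrt{2c'}\, C + C^2 c'\bigr)\, 2 \varepsilon k
   \le \Bigl(\tfrac{\sqrt2}{2} + \tfrac14\Bigr)\, 2 \varepsilon k \le 2 \varepsilon k
   \qquad \text{for } c' \le \tfrac{1}{4C^2}.
\end{align*}
```

Hence the event $\{Z \ge 2\varepsilon k\}$ is contained in the event whose probability is bounded by $\exp(-t)$.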

This finishes the proof of (3.4). Let us note that (3.5) follows in the same manner with (3.8)
replaced by

\[ \mathbb{P}_b\bigl(Z \le -2\sqrt2\, \|\mu\|_2 \sqrt t\bigr) \le \exp(-t), \]

which may again be found in Lemma 1, Section 4.1 of [8].

Remark 3.2. The statement and the proof of Theorem 1.1 do not change if we replace the partial
circulant matrix $M_{a,k}$ with any $k \times d$ submatrix of $M_a$.

Acknowledgement: The author would like to thank Aicke Hinrichs for valuable comments. The 
author also acknowledges the financial support provided by the FWF project Y 432-N15 START- 
Preis Sparse Approximation and Optimization in High Dimensions. 






References 

[1] D. Achlioptas, Database-friendly random projections: Johnson-Lindenstrauss with binary 
coins. J. Comput. Syst. Sci., 66(4):671-687, 2003. 

[2] N. Ailon and B. Chazelle, Approximate nearest neighbors and the fast Johnson-Lindenstrauss 
transform. In Proc. 38th Annual ACM Symposium on Theory of Computing, 2006. 

[3] N. Ailon and B. Chazelle, The fast Johnson-Lindenstrauss transform and approximate nearest
neighbors. SIAM J. Comput., 39(1):302-322, 2009.

[4] S. Dasgupta and A. Gupta, An elementary proof of a theorem of Johnson and Lindenstrauss. 
Random. Struct. Algorithms, 22:60-65, 2003. 

[5] A. Hinrichs and J. Vybíral, Johnson-Lindenstrauss lemma for circulant matrices, submitted,
available at http://arxiv.org/abs/1001.4919.

[6] P. Indyk and R. Motwani, Approximate nearest neighbors: Towards removing the curse of 
dimensionality. In Proc. 30th Annual ACM Symposium on Theory of Computing, pp. 604-613, 
1998. 

[7] W. B. Johnson and J. Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space.
Contemp. Math., 26:189-206, 1984.

[8] B. Laurent and P. Massart, Adaptive estimation of a quadratic functional by model selection. 
Ann. Statist. 28(5): 1302-1338, 2000. 

[9] J. Matoušek, On variants of the Johnson-Lindenstrauss lemma. Random Struct. Algorithms,
33(2):142-156, 2008.






